10 research outputs found

    Basins of Attraction, Commitment Sets and Phenotypes of Boolean Networks

    The attractors of Boolean networks and their basins have been shown to be highly relevant for model validation and predictive modelling, e.g., in systems biology. Yet very few tools are currently available that can compute and visualise not only attractors but also their basins. In the realm of asynchronous, non-deterministic modelling, the repertoire of software is even more limited, and formal notions for basins of attraction are often lacking. In this setting, the difficulty for both theory and computation arises from the fact that states may be elements of several distinct basins. In this paper we address this topic by partitioning the state space into sets that are committed to the same attractors. These commitment sets can easily be generalised to sets that are equivalent w.r.t. the long-term behaviours of pre-selected nodes. This leads us to the notions of markers and phenotypes, which we illustrate in a case study on bladder tumorigenesis. For every concept we propose equivalent CTL model checking queries, and an extension of the state-of-the-art model checking software NuSMV is made available that is capable of computing the respective sets. All notions are fully integrated as three new modules in our Python package PyBoolNet, including functions for visualising the basins, commitment sets and phenotypes as quotient graphs and pie charts.
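    The idea of commitment sets can be illustrated with a minimal brute-force sketch. It does not use PyBoolNet or NuSMV; the two-node mutual-inhibition network `toy` and all function names are assumptions made for illustration. The code builds the asynchronous transitions explicitly, finds the attractors, and groups states by the set of attractors they can still reach.

```python
from itertools import product

def successors(state, update):
    """Asynchronous successors: exactly one node moves to its target value."""
    succ = []
    for i, f in enumerate(update):
        target = int(f(state))
        if target != state[i]:
            succ.append(state[:i] + (target,) + state[i + 1:])
    return succ  # empty for a steady state

def reachable(state, update):
    """All states reachable from `state`, including `state` itself."""
    seen, stack = {state}, [state]
    while stack:
        for nxt in successors(stack.pop(), update):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)

def commitment_sets(update):
    """Group states by the set of attractors reachable from them."""
    states = list(product((0, 1), repeat=len(update)))
    reach = {s: reachable(s, update) for s in states}
    # A reach-set is an attractor if each of its members reproduces it.
    attractors = {reach[s] for s in states
                  if all(reach[t] == reach[s] for t in reach[s])}
    sets = {}
    for s in states:
        key = frozenset(a for a in attractors if a <= reach[s])
        sets.setdefault(key, set()).add(s)
    return attractors, sets

# Hypothetical toy network (not from the paper): x' = not y, y' = not x.
toy = (lambda s: not s[1], lambda s: not s[0])
attractors, sets = commitment_sets(toy)
```

    In this toy network the steady states (1, 0) and (0, 1) are the attractors, and the states (0, 0) and (1, 1) can reach both of them, so they form a commitment set of their own. This explicit enumeration only works for very small networks; the paper relies on CTL model checking instead.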

    Approximating attractors of Boolean networks by iterative CTL model checking

    This paper introduces the notion of approximating asynchronous attractors of Boolean networks by minimal trap spaces. We define three criteria for determining the quality of an approximation: “faithfulness”, which requires that the oscillating variables of all attractors in a trap space correspond to their dimensions; “univocality”, which requires that there is a unique attractor in each trap space; and “completeness”, which requires that there are no attractors outside of a given set of trap spaces. Each is a reachability property for which we give equivalent model checking queries. Whereas faithfulness and univocality can be decided by model checking the corresponding subnetworks, the naive query for completeness must be evaluated on the full state space. Our main result is an alternative approach which is based on the iterative refinement of an initially poor approximation. The algorithm detects so-called autonomous sets in the interaction graph, i.e., sets of variables that contain all their regulators, and considers their intersection and extension in order to perform model checking on the smallest possible state spaces. A benchmark, in which we apply the algorithm to 18 published Boolean networks, is given. In each case, the minimal trap spaces are faithful, univocal, and complete, which suggests that they are in general good approximations for the asymptotics of Boolean networks.
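    To make the notion of a minimal trap space concrete, here is a small exhaustive sketch. It is not the iterative algorithm of the paper; the three-node network `toy` and the function names are hypothetical. A subspace is written as a tuple in which `None` marks a free variable, and it is a trap space if no update can change any of its fixed variables.

```python
from itertools import product

def states_of(space):
    """Enumerate all states of a subspace; None marks a free variable."""
    options = [(0, 1) if v is None else (v,) for v in space]
    return product(*options)

def is_trap_space(space, update):
    """No transition leaves the subspace: fixed variables keep their values."""
    return all(
        int(update[i](s)) == v
        for s in states_of(space)
        for i, v in enumerate(space)
        if v is not None
    )

def contains(big, small):
    """True if subspace `small` lies inside subspace `big`."""
    return all(b is None or b == s for b, s in zip(big, small))

def minimal_trap_spaces(update):
    """Trap spaces that contain no other trap space (brute force)."""
    spaces = product((0, 1, None), repeat=len(update))
    traps = [p for p in spaces if is_trap_space(p, update)]
    return [p for p in traps
            if not any(q != p and contains(p, q) for q in traps)]

# Hypothetical toy network: x' = not y, y' = not x, z' = not z.
toy = (lambda s: not s[1], lambda s: not s[0], lambda s: not s[2])
result = minimal_trap_spaces(toy)
```

    For this toy network the minimal trap spaces are (0, 1, None) and (1, 0, None). Each contains a single cyclic attractor in which only z oscillates, so in this example the approximation is faithful and univocal in the sense defined above.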

    Designing miRNA-Based Synthetic Cell Classifier Circuits Using Answer Set Programming

    Cell classifier circuits are synthetic biological circuits capable of distinguishing between different cell states depending on specific cellular markers and engendering a state-specific response. Examples are classifiers for cancer cells that recognize whether a cell is healthy or diseased based on its miRNA fingerprint and trigger cell apoptosis in the latter case. Binarization of continuous miRNA expression levels allows a classifier to be formalized as a Boolean function whose output codes for the cell condition. In this framework, the classifier design problem consists of finding a Boolean function capable of reproducing correct labelings of miRNA profiles. The specifications of such a function can then be used as a blueprint for constructing a corresponding circuit in the lab. To find an optimal classifier in terms of both performance and reliability, however, accuracy, design simplicity and constraints derived from the availability of molecular building blocks for the classifiers all need to be taken into account. These complexities translate to computational difficulties, so currently available methods explore only part of the design space and consequently are only capable of calculating locally optimal designs. We present a computational approach for finding globally optimal classifier circuits based on binarized miRNA datasets using Answer Set Programming for efficient scanning of the entire search space. Additionally, the method is capable of computing all optimal solutions, allowing for comparison between optimal classifier designs and identification of key features. Several case studies illustrate the applicability of the approach and highlight the quality of results in comparison with a state-of-the-art method. The method is fully implemented and a comprehensive performance analysis demonstrates its reliability and scalability.
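    The design problem can be sketched on a toy scale as follows. This is plain exhaustive search in Python, swapped in for the Answer Set Programming encoding of the paper, and it only considers single conjunctions of literals, which is a simplifying assumption. The miRNA names and sample data are invented for illustration.

```python
from itertools import combinations, product

# Hypothetical binarized profiles: miRNA -> 0 (low) / 1 (high), plus a label
# (1 = diseased, 0 = healthy). Names and values are made up.
SAMPLES = [
    ({"miR-a": 1, "miR-b": 0, "miR-c": 1}, 1),
    ({"miR-a": 1, "miR-b": 0, "miR-c": 0}, 1),
    ({"miR-a": 1, "miR-b": 1, "miR-c": 1}, 0),
    ({"miR-a": 0, "miR-b": 0, "miR-c": 0}, 0),
]

def classify(literals, profile):
    """A conjunction fires if every (miRNA, required value) literal holds."""
    return int(all(profile[m] == v for m, v in literals))

def optimal_conjunctions(samples):
    """Return all smallest conjunctions that reproduce every label."""
    names = sorted(samples[0][0])
    for size in range(1, len(names) + 1):
        hits = [
            tuple(zip(subset, values))
            for subset in combinations(names, size)
            for values in product((0, 1), repeat=size)
            if all(classify(tuple(zip(subset, values)), p) == y
                   for p, y in samples)
        ]
        if hits:  # the first size with a solution is the minimum
            return hits
    return []

best = optimal_conjunctions(SAMPLES)
```

    On this data the only smallest classifier is "miR-a high and miR-b low". Because the search visits every candidate of a given size before stopping, it returns all optimal designs of that size, which mirrors the global-optimality idea but does not scale the way a dedicated solver does.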

    Beiträge zur Analyse von Qualitativen Modellen Genregulatorischer Netzwerke (Contributions to the Analysis of Qualitative Models of Gene Regulatory Networks)

    This thesis addresses three challenges in modeling regulatory and signal transduction networks. The starting point is the generalized logical formalism as introduced by R. Thomas and further developed by D. Thieffry, E. H. Snoussi and M. Kaufman. We introduce the fundamental concepts that make up such models, the interaction graph and the state transition graph, as well as model checking, a computer science technique for deciding whether a finite transition system satisfies a given temporal specification. The first problem we turn to is that of whether a given model is consistent with time series data. To do so we introduce query patterns that can be automatically derived from discretized data. Time series data, being such an abundant source of information for reverse engineering, has previously been used in the context of logical models but only under the synchronous, transition-based notion of consistency. The arguably more realistic asynchronous transition relation has so far been excluded from such data-driven reverse engineering, probably because the corresponding non-determinism in the transition system introduces additional obstacles to the already hard problem. Our contribution here is a path-based notion of consistency between model and data that works for any transition relation. In particular, we discuss linear time properties like monotonicity and branching time properties like robustness. The results are several query patterns, similar to but more complex than the ones proposed by P. T. Monteiro et al. A toolbox, called TemporalLogicTimeSeries, for the automated construction of queries from data is also presented. The second problem we turn to concerns the two types of long-term behaviors that logical models are capable of producing: steady states, in which the activity levels of all network components are kept at a fixed value, and cyclic attractors, in which some components are unsteady and produce sustained oscillations.
We attempt to understand the emergence of these behaviors by searching for symbolic steady states as defined by H. Siebert. Our main contribution is the introduction of the prime implicant graph, which describes all minimal conditions under which components may change their activities, and an optimization-based algorithm for the enumeration of all maximal and minimal symbolic steady states. Essentially, we generalize the canalizing effects and forcing structure that were first introduced and studied by S. Kauffman and F. Fogelman in the context of random Boolean networks. The chapter includes a theorem that relates symbolic steady states to the existence of positive feedback circuits in the interaction graph. A toolbox, called BoolNetFixpoints, that implements our algorithm is also described. The theme of the last chapter is how to deal with uncertainties that inevitably appear during the modeling of biological systems. One is often forced to resolve them since most types of analysis require a single, fully specified model. The knowledge gap is usually filled by making simplifications or by introducing additional assumptions that are hard to justify and therefore somewhat arbitrary. The alternative is to work with and analyze sets of alternative models, rather than single models. This idea entails additional theoretical and practical challenges: With which language should we describe our partial knowledge about a system? How can predictions be made given that each model in the set may behave differently? How can hypotheses and additional data be added to the current knowledge in a systematic manner? It seems that there are in principle two different approaches. The first one is constraint-based and studied by F. Corblin et al. It translates the partial knowledge and modeling formalism into facts and rules of a logic program. Common solvers can then deduce additional properties or test the validity of given queries across all models.
In contrast, we propose to study the pros and cons of an explicit approach that enumerates all models that agree with a given partial specification. During the first step, models are enumerated and stored in a database. During a second step, models are annotated with additional information that is obtained from custom algorithms. The relationships between the annotations are then analyzed in a third step. The chapter is based on the prototype implementation LogicModelClassifier that performs the discussed steps. Throughout, we apply our results to two previously published models of biological systems. The first one is a small model of the galactose switch which regulates the transcription of genes that are involved in the metabolism of yeast. We address questions that arise during the construction of the model, for example the number of involved components and their interactions, as well as issues related to model validation and model revision with time series data. The case study also discusses different approaches to data discretization. The second one is a medium size model of the MAPK network studied by D. Thieffry et al. that is used to predict the cell fate response to different stimuli involving the growth factors EGF, TGFB, FGF and DNA damage. With the methods developed in this thesis we can prove that the model is capable of 18 different asymptotic behaviors, 12 of them steady states and 6 cyclic attractors. The question of which attractor is reached from which initial state is answered and we can show that the response in terms of proliferation or growth arrest and apoptosis is fully determined by the input stimulus.
This thesis deals with three challenges that arise when modeling regulatory networks and signal transduction. We first describe the logical formalism that was introduced by R. Thomas and further developed by D. Thieffry, E. H. Snoussi and M. Kaufman.
It is characterized by the fact that the components of the model only take values from a finite range. We present the fundamental objects of a logical model, the state transition graph and the interaction graph, and discuss model checking, a method for automatically verifying temporal logic expressions in given models. The first part of the thesis deals with how we can decide whether a given model is consistent with time series data. To this end we construct various query patterns by which data can be translated into temporal logics. Time series data play an important role in the reverse engineering of logical models from data, but so far only under the assumption that the transitions of the transition system underlying the model are synchronous. The more realistic assumption, namely that the activities of the components change asynchronously, has not previously been studied in this context. This is probably because the resulting non-deterministic transition systems further complicate an already difficult problem. Our contribution in this context consists of several path-based definitions of consistency that can be checked independently of the chosen transition relation. We discuss the possibility of encoding monotonicity and robustness assumptions using Linear Temporal Logic and Computation Tree Logic. In addition, the toolbox "TemporalLogicTimeSeries" for the automatic generation of the discussed queries is presented. In the second part we turn to the long-term behavior and the attractors of logical models. We attempt to explain the existence of steady states, in which the activities of all components remain constant, and also of cyclic attractors, in which some components are permanently unstable, by means of so-called symbolic steady states.
The results refer to the definitions of H. Siebert. Prime implicants are introduced as minimal conditions under which discrete functions can change their value, and the prime implicant graph is presented. The central result is that symbolic steady states are represented by certain sets of edges in this graph. These can be described by 0-1 optimization problems and found using common constraint solvers. A script that performs all the described steps is available under the name "BoolNetFixpoints". In the last part of the thesis we deal with uncertainties that inevitably arise during the modeling of biological systems. One is often forced to resolve them, since most analysis methods require fully specified models. This is often done by making strong simplifications or by adopting assumptions that are hard to justify and therefore arbitrary. The alternative is to work simultaneously with all models that are compatible with the current state of knowledge. This gives rise to additional theoretical and practical challenges: With which language can models be partially specified? How can predictions be made when each model may potentially behave differently? How can additional assumptions and data be added as systematically as possible? In principle there are two approaches. The constraint programming approach, implemented by F. Corblin et al., translates the available partial model and the modeling formalism into facts and rules of a logic program. Common logic programming solvers can then check whether or not a property can be derived from this program. In contrast, we investigate the advantages and disadvantages of an explicit approach.
Here, all models that are consistent with a given specification are enumerated and stored in a database. In a second step, the models can be annotated with additional information, whose relationships to one another are then evaluated in a third step. The chapter follows the prototype implementation "LogicModelClassifier", with which the discussed steps can be carried out. The developed methods and ideas are illustrated on two models. The first is a small model of the galactose gene switch in yeast, which is involved in metabolism. Questions that arise when setting up the model are addressed, for example how many components are needed and how they should interact. Furthermore, model validation and revision using expression data are discussed. Different approaches to discretizing the data are compared. The second is a larger model of the MAPK system, which describes the fate of cancer cells depending on various environmental influences. These influences include the growth factors EGF, TGFB and FGF as well as DNA damage. With the methods and ideas developed in the dissertation we can show that the model is capable of exhibiting 18 different responses. 12 of them are steady states and 6 are cyclic attractors. The question of which attractor can be reached from which initial state is answered, and we can show that the asymptotic behavior of the model, with respect to the decision between cell growth and cell death, is fully determined by the initial conditions.
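    The path-based notion of consistency from the first part of the thesis can be illustrated with a short sketch. It does not reproduce TemporalLogicTimeSeries or its query patterns; it assumes the simplest reading, namely that a discretized time series is consistent with a model if some path of the asynchronous transition system visits the measured states in order. The network `toy` and all names are hypothetical.

```python
def successors(state, update):
    """Asynchronous successors: exactly one node moves to its target value."""
    succ = []
    for i, f in enumerate(update):
        target = int(f(state))
        if target != state[i]:
            succ.append(state[:i] + (target,) + state[i + 1:])
    return succ

def reachable(state, update):
    """All states reachable from `state`, including `state` itself."""
    seen, stack = {state}, [state]
    while stack:
        for nxt in successors(stack.pop(), update):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def path_consistent(series, update):
    """True if some path visits the measured states in the given order.

    Paths can be concatenated, so it suffices that each measurement is
    reachable from the one before it.
    """
    return all(later in reachable(earlier, update)
               for earlier, later in zip(series, series[1:]))

# Hypothetical toy network: x' = not y, y' = not x.
toy = (lambda s: not s[1], lambda s: not s[0])
ok = path_consistent([(0, 0), (1, 0)], toy)
bad = path_consistent([(1, 0), (0, 1)], toy)
```

    Here `ok` is true, because (1, 0) can be reached from (0, 0) by updating x alone, while `bad` is false, because (1, 0) is a steady state that cannot be left. Under the synchronous update the only successor of (0, 0) would be (1, 1), which shows how the choice of transition relation changes which data a model can explain.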

    Presentation_1_Designing miRNA-Based Synthetic Cell Classifier Circuits Using Answer Set Programming.PDF

    Cell classifier circuits are synthetic biological circuits capable of distinguishing between different cell states depending on specific cellular markers and engendering a state-specific response. Examples are classifiers for cancer cells that recognize whether a cell is healthy or diseased based on its miRNA fingerprint and trigger cell apoptosis in the latter case. Binarization of continuous miRNA expression levels allows a classifier to be formalized as a Boolean function whose output codes for the cell condition. In this framework, the classifier design problem consists of finding a Boolean function capable of reproducing correct labelings of miRNA profiles. The specifications of such a function can then be used as a blueprint for constructing a corresponding circuit in the lab. To find an optimal classifier in terms of both performance and reliability, however, accuracy, design simplicity and constraints derived from the availability of molecular building blocks for the classifiers all need to be taken into account. These complexities translate to computational difficulties, so currently available methods explore only part of the design space and consequently are only capable of calculating locally optimal designs. We present a computational approach for finding globally optimal classifier circuits based on binarized miRNA datasets using Answer Set Programming for efficient scanning of the entire search space. Additionally, the method is capable of computing all optimal solutions, allowing for comparison between optimal classifier designs and identification of key features. Several case studies illustrate the applicability of the approach and highlight the quality of results in comparison with a state-of-the-art method. The method is fully implemented and a comprehensive performance analysis demonstrates its reliability and scalability.